Some intriguing properties of Tukey's half-space depth

Authors: Chaudhuri Probal, Dutta Subhajit, Ghosh Anil K.
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/01/2011

For multivariate data, Tukey's half-space depth is one of the most popular depth functions available in the literature. It is conceptually simple and satisfies several desirable properties of depth functions. The Tukey median, the multivariate median associated with the half-space depth, is also a well-known measure of center for multivariate data with several interesting properties. In this article, we derive and investigate some interesting properties of half-space depth and its associated multivariate median. These properties, some of which are counterintuitive, have important statistical consequences in multivariate analysis. We also investigate a natural extension of Tukey's half-space depth and the related median for probability distributions on any Banach space (which may be finite- or infinite-dimensional) and prove some results that demonstrate anomalous behavior of half-space depth in infinite-dimensional spaces. Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm) at http://dx.doi.org/10.3150/10-BEJ322.
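As a rough illustration of the depth function discussed in this abstract, the sketch below approximates the empirical half-space depth of a point in the plane. It is not taken from the paper: the function name `halfspace_depth_2d`, the restriction to two dimensions, and the use of a finite grid of directions (`n_directions`) are all assumptions made for the example. Because only finitely many directions are checked, the result can overestimate the true depth.

```python
import math

def halfspace_depth_2d(x, points, n_directions=360):
    """Approximate empirical Tukey half-space depth of x in the plane.

    For each unit direction u on a grid of angles, compute the fraction of
    sample points p with <u, p - x> >= 0, i.e. the points in the closed
    half-plane whose boundary passes through x. The depth is the minimum
    of these fractions over the directions tried (hypothetical helper,
    an approximation rather than an exact algorithm).
    """
    depth = 1.0
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        u = (math.cos(theta), math.sin(theta))
        count = sum(
            1 for p in points
            if u[0] * (p[0] - x[0]) + u[1] * (p[1] - x[1]) >= 0.0
        )
        depth = min(depth, count / len(points))
    return depth
```

A point deep inside the data cloud receives a depth near 1/2, while a point outside the convex hull of the sample receives depth 0. A Tukey median is then any point that maximises this depth.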

arXiv.org e-Print Archive

CiteSeerX

Crossref

Swords: a statistical tool for analysing large DNA sequences

Authors: Chaudhuri Probal, Das Sandip
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2002

In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software tool called swords. Using sequences available in public-domain databases hosted on the Internet, we demonstrate how swords can be conveniently used by molecular biologists and geneticists to unmask biologically important features hidden in large sequences and assess their statistical significance.
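The idea of a frequency distribution of DNA words can be sketched as follows. This is only an illustration and not the swords software itself: the function name `word_frequencies`, the choice of overlapping words, and the conversion to upper case are assumptions made for the example.

```python
from collections import Counter

def word_frequencies(sequence, k):
    """Relative frequencies of overlapping words of length k in a sequence.

    Hypothetical helper: slides a window of width k along the sequence one
    base at a time and returns {word: count / number_of_windows}.
    """
    seq = sequence.upper()
    n_words = len(seq) - k + 1
    if k <= 0 or n_words <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_words))
    return {word: count / n_words for word, count in counts.items()}
```

Two sequences could then be compared through some distance between their word-frequency distributions; which distance and which significance test are appropriate is not stated in the abstract.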

On estimators of the mean of infinite dimensional data in finite populations

Authors: Chaudhuri Probal, Dey Anurag
Publication date: 24/05/2023

The Horvitz-Thompson (HT), the Rao-Hartley-Cochran (RHC) and the generalized regression (GREG) estimators of the finite population mean are considered, when the observations are from an infinite-dimensional space. We compare these estimators based on their asymptotic distributions under some commonly used sampling designs and some superpopulations satisfying linear regression models. We show that the GREG estimator is asymptotically at least as efficient as either of the other two estimators under the different sampling designs considered in this paper. Further, we show that the use of some well-known sampling designs utilizing auxiliary information may have an adverse effect on the performance of the GREG estimator when the degree of heteroscedasticity present in the linear regression models is not very large. On the other hand, the use of those sampling designs improves the performance of this estimator when the degree of heteroscedasticity present in the linear regression models is large. We develop methods for determining the degree of heteroscedasticity, which in turn determines the choice of the appropriate sampling design to be used with the GREG estimator. We also investigate the consistency of the covariance operators of the above estimators. We carry out some numerical studies using real and synthetic data, and our theoretical results are supported by the results obtained from those numerical studies.
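For orientation, the Horvitz-Thompson estimator of a population mean weights each sampled observation by the inverse of its inclusion probability and divides by the population size. The sketch below applies that formula pointwise to curves observed on a common grid. The discretisation, the function name `horvitz_thompson_mean` and its parameter names are assumptions made for the example, not the paper's implementation.

```python
def horvitz_thompson_mean(sample_curves, inclusion_probs, population_size):
    """Pointwise Horvitz-Thompson estimate of a finite population mean.

    sample_curves   : list of equal-length lists, each a sampled curve
                      evaluated on a shared grid (assumed discretisation)
    inclusion_probs : first-order inclusion probability of each sampled unit
    population_size : number of units N in the finite population
    """
    n_grid = len(sample_curves[0])
    total = [0.0] * n_grid
    for curve, pi in zip(sample_curves, inclusion_probs):
        for j, value in enumerate(curve):
            # Inverse-probability weighting of each sampled observation.
            total[j] += value / pi
    return [t / population_size for t in total]
```

Under simple random sampling without replacement, every inclusion probability equals n/N and the estimate reduces to the ordinary sample mean curve.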

arXiv.org e-Print Archive